AITopics | finite set

Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

Neural Information Processing SystemsMar-16-2026, 22:29:17 GMT

Learning the minimum/maximum mean among a finite set of distributions is a fundamental sub-problem in planning, game tree search and reinforcement learning. We formalize this learning task as the problem of sequentially testing how the minimum mean among a finite set of distributions compares to a given threshold. We develop refined non-asymptotic lower bounds, which show that optimality mandates very different sampling behavior for a low vs high true minimum. We show that Thompson Sampling and the intuitive Lower Confidence Bounds policy each nail only one of these cases. We develop a novel approach that we call Murphy Sampling. Even though it entertains exclusively low true minima, we prove that MS is optimal for both possibilities. We then design advanced self-normalized deviation inequalities, fueling more aggressive stopping rules. We complement our theoretical guarantees by experiments showing that MS works best in practice.

artificial intelligence, machine learning, proceedings, (3 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.61)

Add feedback

b4572f47b7c69e27b8e46646d9579e67-Paper.pdf

Neural Information Processing SystemsFeb-19-2026, 08:00:01 GMT

surrogate, surrogate loss, surrogate regret, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Country:

North America > United States (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Neural Information Processing SystemsDec-27-2025, 15:04:17 GMT

Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-arm bandit framework, where the objective is to minimize the cumulative regret over a sequence of tasks by incrementally transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for the estimation of the possible tasks and derive regret bounds for it.

multi-armed bandit, name change, sequential transfer, (2 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.42)

Add feedback

Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

Neural Information Processing SystemsNov-20-2025, 22:28:26 GMT

Learning the minimum/maximum mean among a finite set of distributions is a fundamental sub-problem in planning, game tree search and reinforcement learning. We formalize this learning task as the problem of sequentially testing how the minimum mean among a finite set of distributions compares to a given threshold. We develop refined non-asymptotic lower bounds, which show that optimality mandates very different sampling behavior for a low vs high true minimum. We show that Thompson Sampling and the intuitive Lower Confidence Bounds policy each nail only one of these cases. We develop a novel approach that we call Murphy Sampling. Even though it entertains exclusively low true minima, we prove that MS is optimal for both possibilities. We then design advanced self-normalized deviation inequalities, fueling more aggressive stopping rules. We complement our theoretical guarantees by experiments showing that MS works best in practice.

lowest mean, murphy sampling, sequential test, (4 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.61)

Add feedback

From Ethical Declarations to Provable Independence: An Ontology-Driven Optimal-Transport Framework for Certifiably Fair AI Systems

Bhattacharya, Sukriti, Majumdar, Chitro

arXiv.org Artificial IntelligenceOct-10-2025

This paper presents a framework for provably fair AI that overcomes the limits of current bias mitigation methods by systematically removing all sensitive information and its proxies. Using ontology engineering in OWL 2 QL, it formally defines sensitive attributes and infers their proxies through logical reasoning, constructing a sigma algebra G that captures the full structure of biased patterns. Fair representations are then obtained via Delbaen Majumdar optimal transport, which generates variables independent of G while minimizing L2 distance to preserve accuracy. This guarantees true independence rather than mere decorrelation. By modeling bias as dependence between sigma algebras, compiling ontological knowledge into measurable structures, and using optimal transport as the unique fair transformation, the approach ensures complete fairness in tasks like loan approval, where proxies such as ZIP code reveal race. The result is a certifiable and mathematically grounded method for trustworthy AI.

artificial intelligence, information, proxy, (18 more...)

arXiv.org Artificial Intelligence

2510.08086

Country: Europe (0.46)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Health & Medicine (0.68)
Banking & Finance > Insurance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

Factor Graph Grammars

Neural Information Processing SystemsOct-2-2025, 20:33:29 GMT

Moreover, inference can be done on FGGs without enumerating all the generated factor graphs.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Neural Information Processing SystemsSep-30-2025, 11:02:06 GMT

Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-arm bandit framework, where the objective is to minimize the cumulative regret over a sequence of tasks by incrementally transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for the estimation of the possible tasks and derive regret bounds for it.

artificial intelligence, data mining, machine learning, (6 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting (0.62)

Technology: